SGML exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Automatic Tagging of Compound Verb Groups in Czech Corpora

Internal identifier: 001C01 (Main/Exploration); previous: 001C00; next: 001C02

Authors: Eva Žáčková [Czech Republic]; Luboš Popelínský [Czech Republic]; Miloslav Nepil [Czech Republic]

RBID : ISTEX:5986FD9D5B5AAF48236DA0482A5B726C251FF4C2

Abstract

In Czech corpora, compound verb groups are usually tagged in a word-by-word manner. As a consequence, some of the morphological tags of particular components of the verb group lose their original meaning. We present an improved method for automatic synthesis of verb rules. These rules describe all compound verb groups that are frequent in Czech. Using these rules, we can find compound verb groups in unannotated texts with high accuracy. The system for tagging compound verb groups in an annotated corpus that exploits the verb rules is described.

DOI: 10.1007/3-540-45323-7_20


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Automatic Tagging of Compound Verb Groups in Czech Corpora</title>
<author>
<name sortKey="Za Kova, Eva" sort="Za Kova, Eva" uniqKey="Za Kova E" first="Eva" last="Žá Ková">Eva Žá Ková</name>
</author>
<author>
<name sortKey="Popelinsk, Lubos" sort="Popelinsk, Lubos" uniqKey="Popelinsk L" first="Luboš" last="Popelínsk">Luboš Popelínsk</name>
</author>
<author>
<name sortKey="Nepil, Miloslav" sort="Nepil, Miloslav" uniqKey="Nepil M" first="Miloslav" last="Nepil">Miloslav Nepil</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:5986FD9D5B5AAF48236DA0482A5B726C251FF4C2</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1007/3-540-45323-7_20</idno>
<idno type="url">https://api.istex.fr/ark:/67375/HCB-FZCR209N-S/fulltext.pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001778</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">001778</idno>
<idno type="wicri:Area/Istex/Curation">001276</idno>
<idno type="wicri:Area/Istex/Checkpoint">001A30</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">001A30</idno>
<idno type="wicri:doubleKey">0302-9743:2000:Za Kova E:automatic:tagging:of</idno>
<idno type="wicri:Area/Main/Merge">001C45</idno>
<idno type="wicri:Area/Main/Curation">001C01</idno>
<idno type="wicri:Area/Main/Exploration">001C01</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Automatic Tagging of Compound Verb Groups in Czech Corpora</title>
<author>
<name sortKey="Za Kova, Eva" sort="Za Kova, Eva" uniqKey="Za Kova E" first="Eva" last="Žá Ková">Eva Žá Ková</name>
<affiliation wicri:level="3">
<country xml:lang="fr">République tchèque</country>
<wicri:regionArea>NLP Laboratory, Faculty of Informatics, Masaryk University, Botanická 68, CZ-60200, Brno</wicri:regionArea>
<placeName>
<settlement type="city">Brno</settlement>
<region>Moravie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">République tchèque</country>
</affiliation>
</author>
<author>
<name sortKey="Popelinsk, Lubos" sort="Popelinsk, Lubos" uniqKey="Popelinsk L" first="Luboš" last="Popelínsk">Luboš Popelínsk</name>
<affiliation wicri:level="3">
<country xml:lang="fr">République tchèque</country>
<wicri:regionArea>NLP Laboratory, Faculty of Informatics, Masaryk University, Botanická 68, CZ-60200, Brno</wicri:regionArea>
<placeName>
<settlement type="city">Brno</settlement>
<region>Moravie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">République tchèque</country>
</affiliation>
</author>
<author>
<name sortKey="Nepil, Miloslav" sort="Nepil, Miloslav" uniqKey="Nepil M" first="Miloslav" last="Nepil">Miloslav Nepil</name>
<affiliation wicri:level="3">
<country xml:lang="fr">République tchèque</country>
<wicri:regionArea>NLP Laboratory, Faculty of Informatics, Masaryk University, Botanická 68, CZ-60200, Brno</wicri:regionArea>
<placeName>
<settlement type="city">Brno</settlement>
<region>Moravie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">République tchèque</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s" type="main" xml:lang="en">Lecture Notes in Computer Science</title>
<idno type="ISSN">0302-9743</idno>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In Czech corpora, compound verb groups are usually tagged in a word-by-word manner. As a consequence, some of the morphological tags of particular components of the verb group lose their original meaning. We present an improved method for automatic synthesis of verb rules. These rules describe all compound verb groups that are frequent in Czech. Using these rules, we can find compound verb groups in unannotated texts with high accuracy. The system for tagging compound verb groups in an annotated corpus that exploits the verb rules is described.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>République tchèque</li>
</country>
<region>
<li>Moravie</li>
</region>
<settlement>
<li>Brno</li>
</settlement>
</list>
<tree>
<country name="République tchèque">
<region name="Moravie">
<name sortKey="Za Kova, Eva" sort="Za Kova, Eva" uniqKey="Za Kova E" first="Eva" last="Žá Ková">Eva Žá Ková</name>
</region>
<name sortKey="Nepil, Miloslav" sort="Nepil, Miloslav" uniqKey="Nepil M" first="Miloslav" last="Nepil">Miloslav Nepil</name>
<name sortKey="Popelinsk, Lubos" sort="Popelinsk, Lubos" uniqKey="Popelinsk L" first="Luboš" last="Popelínsk">Luboš Popelínsk</name>
<name sortKey="Za Kova, Eva" sort="Za Kova, Eva" uniqKey="Za Kova E" first="Eva" last="Žá Ková">Eva Žá Ková</name>
</country>
</tree>
</affiliations>
</record>
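
The record above can also be read programmatically. The following is a minimal sketch, not part of the Dilib toolchain, that pulls the title, author names, and DOI from an abbreviated copy of the record using Python's standard `xml.etree.ElementTree`. The `wicri:` prefix is not declared inside the record as displayed, so the sketch assumes a wrapper element (`wrap`, hypothetical) that binds it to a placeholder URI (`urn:example:wicri`, also hypothetical); the function name `summarize` is likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record above; only the fields read below are kept.
RECORD = """<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader><fileDesc>
<titleStmt>
<title xml:lang="en">Automatic Tagging of Compound Verb Groups in Czech Corpora</title>
<author><name first="Miloslav" last="Nepil">Miloslav Nepil</name></author>
</titleStmt>
<publicationStmt>
<idno type="doi">10.1007/3-540-45323-7_20</idno>
</publicationStmt>
</fileDesc></teiHeader>
</TEI>
</record>"""

# Assumption: the wicri: prefix is undeclared in the record as shown, so a
# hypothetical wrapper element binds it to a placeholder URI for the parser.
WRAPPED = '<wrap xmlns:wicri="urn:example:wicri">' + RECORD + "</wrap>"


def summarize(xml_text):
    """Return the title, author names, and DOI found in a wrapped record."""
    root = ET.fromstring(xml_text)
    title = root.find(".//titleStmt/title").text
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    doi = root.find(".//idno[@type='doi']").text
    return {"title": title, "authors": authors, "doi": doi}


summary = summarize(WRAPPED)
print(summary["title"])
print(summary["authors"], summary["doi"])
```

Wrapping the record rather than editing it keeps the stored data untouched; a full record would pass through `summarize` the same way and return all three authors.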

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Informatique/explor/SgmlV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001C01 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001C01 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Informatique
   |area=    SgmlV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:5986FD9D5B5AAF48236DA0482A5B726C251FF4C2
   |texte=   Automatic Tagging of Compound Verb Groups in Czech Corpora
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jul 1 14:26:08 2019. Site generation: Wed Apr 28 21:40:44 2021